Central Places in Wikipedia
Author
Abstract
Central Place Theory explains the number and locations of cities, towns, and villages based on principles of market areas, transportation, and socio-political interactions between settlements. It assumes a hexagonal segmentation of space, in which every central place is surrounded by six lower-order settlements within its range, to which it supplies goods and services. In reality, this ideal hexagonal model is often distorted by varying population densities, the locations of natural features and resources, and other factors. In this paper, we propose an approach that extracts the structure around a central place and its range from the link structure of the Web. Using a corpus of georeferenced documents from the English-language edition of Wikipedia, we combine weighted links between places with semantic annotations to compute the convex hull around a central place, which marks its range. We compare the results to the structures predicted by Central Place Theory, demonstrating that the Web and its hyperlink structure can indeed be used to infer spatial structures in the real world. We demonstrate our approach for the four largest metropolitan areas in the United States, namely New York City, Los Angeles, Chicago, and Houston.
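The range computation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the place names, coordinates, link weights, and the `min_weight` threshold are hypothetical, the semantic annotations are omitted, and longitude/latitude are treated as planar coordinates, which is only a rough approximation at metropolitan scale. The hull itself is computed with the standard monotone chain algorithm.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude), treated as planar x/y


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counterclockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Monotone chain convex hull; returns vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def estimate_range(
    center: Point,
    linked_places: Dict[str, Tuple[Point, float]],
    min_weight: float,
) -> List[Point]:
    """Hull of the center plus all linked places whose weight passes the threshold.

    `linked_places` maps a place name to (coordinates, link weight). Both the
    weighting and the threshold are assumptions made for this illustration.
    """
    kept = [coord for coord, weight in linked_places.values() if weight >= min_weight]
    return convex_hull(kept + [center])


# Hypothetical toy data: four strongly linked places around a center,
# one strongly linked place inside the hull, and one weakly linked outlier.
CENTER: Point = (0.0, 0.0)
LINKED: Dict[str, Tuple[Point, float]] = {
    "East": ((1.0, 0.0), 5.0),
    "North": ((0.0, 1.0), 4.0),
    "West": ((-1.0, 0.0), 3.0),
    "South": ((0.0, -1.0), 3.0),
    "Inner": ((0.2, 0.2), 6.0),
    "FarAway": ((5.0, 5.0), 0.5),
}

if __name__ == "__main__":
    print(estimate_range(CENTER, LINKED, min_weight=1.0))
```

With the threshold at 1.0, the weakly linked outlier is dropped and the hull is spanned by the four surrounding places. Lowering the threshold below 0.5 would stretch the hull out to the outlier, which shows how sensitive the estimated range is to the choice of link weighting.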
Similar resources
Is Small More Interesting? Examining Countries' GeoNames Places linked to Wikipedia
Following up on previous analyses [1], we examine the geospatial and thematic data in GeoNames [2]. It is the largest freely available gazetteer – a geographical thesaurus – with a worldwide coverage. One measure of interestingness for a country can be its number of populated places represented as pages in Wikipedia. In GeoNames, on average only 20% of populated places, i.e., cities, towns, vil...
Improving Persian Named Entity Recognition Using the Ezafe Marker
Named entity recognition is the process of identifying names of people, places (cities, countries, seas, etc.), and organizations (public and private companies, international institutions, etc.), as well as dates, currencies, and percentages in a text. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...
Extracting Named Entities and Relating Them over Time Based on Wikipedia
This paper presents an approach to mining information relating people, places, organizations and events extracted from Wikipedia and linking them on a time scale. The approach consists of two phases: (1) identifying relevant pages categorizing the articles as containing people, places or organizations; (2) generating timeline linking named entities and extracting events and their time frame. We...
Geographic information extraction using natural language processing in Wikipedia texts
Geographic information extracted from texts is a valuable source of location data about documents, which can be used to improve information retrieval and document indexing. Linked Data and digital gazetteers provide a large amount of data that can support the recognition of places mentioned in text. Natural Language Processing techniques, which have evolved significantly over the last years, of...
Journal title:
Volume, issue:
Pages: -
Publication date: 2015